The Shannon Cipher System with a Guessing 
Wiretapper: General Sources 

Manjesh Kumar Hanawal and Rajesh Sundaresan 



Abstract 

The Shannon cipher system is studied in the context of general sources using a notion of computational secrecy 
introduced by Merhav & Arikan. Bounds are derived on limiting exponents of guessing moments for general sources. 
The bounds are shown to be tight for iid, Markov, and unifilar sources, thus recovering some known results. A 
close relationship between error exponents and correct decoding exponents for fixed rate source compression on 
the one hand and exponents for guessing moments on the other hand is established. 


Index Terms

cipher systems, correct decoding exponent, error exponent, information spectrum, key rate, length function, 
large deviations, secrecy, sources with memory, fixed-rate source coding 

I. INTRODUCTION 

We consider the classical cipher system of Shannon [1]. Let $X^n = (X_1, \ldots, X_n)$ be a message where
each letter takes values in a finite set $\mathbb{X}$. This message should be communicated securely from a transmitter
to a receiver, both of which have access to a common secure key $U^k$ of $k$ purely random bits independent
of $X^n$. The transmitter computes the cryptogram $Y = f_n(X^n, U^k)$ and sends it to the receiver over a
public channel. The cryptogram may be of variable length. The encryption function $f_n$ is invertible for
any fixed $U^k$. The receiver, knowing $Y$ and $U^k$, computes $X^n = f_n^{-1}(Y, U^k)$. The functions $f_n$ and $f_n^{-1}$
are published. A wiretapping attacker has access to the cryptogram $Y$, knows $f_n$ and $f_n^{-1}$, and attempts
to identify $X^n$ without knowledge of $U^k$. The attacker can use knowledge of the statistics of $X^n$. We
assume that the attacker has a test mechanism that tells him whether a guess of $X^n$ is correct or not. For
example, the attacker may wish to attack an encrypted password or personal information to gain access to,
say, a computer account, or a bank account via the internet, or a classified database [2]. In these situations,
successful entry into the system provides the natural test mechanism. We assume that the attacker is
allowed an unlimited number of guesses. The key rate for the cipher system is $R = k(\ln 2)/n$ nats$^1$ of
secrecy per message (or source) letter.

Merhav & Arikan studied discrete memoryless sources (DMS) in the above setting and characterized
the best attainable moments of the number of guesses required by an attacker. In particular, they showed
that for a DMS with the governing single-letter PMF $P$ on $\mathbb{X}$, the value of the optimal exponent for the
$\rho$th moment ($\rho > 0$) is given by

$$E(R,\rho) = \max_{Q} \left\{ \rho \min\{H(Q), R\} - D(Q \,\|\, P) \right\}. \tag{1}$$

The maximization is over all PMFs $Q$ on $\mathbb{X}$, $H(Q)$ is the Shannon entropy of $Q$, and $D(Q \,\|\, P)$ is the
Kullback-Leibler divergence between $Q$ and $P$. They also showed that $E(R,\rho)$ increases linearly in $R$ for
$R < H(P)$, continues to increase in a concave fashion for $R \in [H(P), H']$, where $H'$ is a threshold, and
is constant for $R > H'$. Unlike the classical equivocation rate analysis, atypical sequences do affect the
behavior of $E(R,\rho)$ for $R \in [H(P), H']$, and perfect secrecy is obtained, i.e., the cryptogram is uncorrelated
with the message, only for $R > H' > H(P)$. Merhav & Arikan also determined the best achievable
performance based on the probability of a large deviation in the number of guesses, and showed that it
equals the Legendre-Fenchel transform of $E(R,\rho)$ as a function of $\rho$. Sundaresan [3] extended the above
results to unifilar sources. Hayashi & Yamamoto [4] proved coding theorems for the Shannon cipher
system with correlated outputs $(X^n, Z^n)$ where the wiretapper is interested in $X^n$ while the receiver is
interested in $Z^n$.

This work was supported by the Defence Research and Development Organisation, Ministry of Defence, Government of India, under the
DRDO-IISc Programme on Advanced Research in Mathematical Engineering, and by the University Grants Commission under Grant Part
(2B) UGC-CAS-(Ph.IV).

The material in this paper was presented in part at the IISc Centenary Conference on Managing Complexity in a Distributed World,
(MCDES 2008) held in Bangalore, India, May 2008. A part of this work was presented at the IEEE International Symposium on Information
Theory (ISIT 2009) held in Seoul, Korea, June 2009.

1 We shall mostly use nat as the unit of information in this paper by taking natural logarithms. $k(\ln 2)/n$ nats per input symbol is the
same as $k/n$ bits per input symbol.

In this paper, we extend Merhav & Arikan's notion of computational secrecy [2] to general sources. 
One motivation is that secret messages typically come from natural languages, which are well modeled
as sources with memory, e.g., a Markov source of appropriate order. Another motivation is that
the study of general sources clearly brings out the connection between guessing and compression, as 
discussed next. 

As with other studies of general sources, the information spectrum plays a crucial role in this paper. We
show that $E(R,\rho)$ is closely related to (a) the error exponent of a rate-$R$ source code, and (b) the correct
decoding exponent of a rate-$R$ source code, when exponentiated probabilities are considered (see Sec.
III-B2). In particular, the exponents in (a) and (b) appear in the first and second terms below when we
rewrite $E(R,\rho)$ for a DMS as

$$E(R,\rho) = \max\left\{ \rho R - \min_{Q:\, H(Q) \ge R} D(Q \,\|\, P),\ \max_{Q:\, H(Q) \le R} \left\{ \rho H(Q) - D(Q \,\|\, P) \right\} \right\}.$$

This brings out the fundamental connection between source coding exponents and key-rate constrained
guessing exponents. Further, unlike the case for the probability of a large deviation in the number of
guesses [2, Sec. V], both the error exponent and the correct decoding exponent determine $E(R,\rho)$. We
extend the above result to general sources by deriving upper and lower bounds on $E(R,\rho)$. We then show
that these are tight for DMS, Markov, and unifilar sources. The bounds may be of interest even if they
are not tight because the upper bound specifies the amount of effort needed by an attacker and the lower
bound specifies the secrecy strength of the cryptosystem to a designer.
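The rewriting of the DMS exponent as a maximum of an error-exponent term and a correct-decoding term can be checked numerically. The following is a minimal sketch, assuming a binary alphabet and a uniform grid over PMFs $Q$; the function names and the grid resolution are illustrative choices, not part of the paper.

```python
import math

def entropy(q):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in q if x > 0)

def kl(q, p):
    """Kullback-Leibler divergence D(q || p) in nats (p strictly positive)."""
    return sum(x * math.log(x / y) for x, y in zip(q, p) if x > 0)

def exponent_direct(p, R, rho, grid=2000):
    """max_Q { rho * min(H(Q), R) - D(Q || P) } over a grid of binary PMFs."""
    best = -math.inf
    for i in range(grid + 1):
        q = (i / grid, 1 - i / grid)
        best = max(best, rho * min(entropy(q), R) - kl(q, p))
    return best

def exponent_split(p, R, rho, grid=2000):
    """Same exponent, split into the two terms of the rewritten formula."""
    err_term = -math.inf  # rho*R - min_{H(Q) >= R} D(Q || P)
    cor_term = -math.inf  # max_{H(Q) <= R} { rho*H(Q) - D(Q || P) }
    for i in range(grid + 1):
        q = (i / grid, 1 - i / grid)
        h, d = entropy(q), kl(q, p)
        if h >= R:
            err_term = max(err_term, rho * R - d)
        if h <= R:
            cor_term = max(cor_term, rho * h - d)
    return max(err_term, cor_term)
```

On a common grid the two functions agree, and the exponent is nondecreasing in the key rate $R$, as the text describes.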

The limiting case as $\rho \downarrow 0$ in (b) yields the classical framework for the probability of correct decoding. This
special case is related to the work of Han [5] and Iriyama [6], who studied the dual problem of rates
required to meet a specified error exponent or a specified correct decoding exponent.

The paper is organized as follows. Section II relates our problem to a modification of Campbell's
compression problem [7]. Section III gives bounds on the limits of the exponential rate of guessing moments
in terms of information spectrum quantities. Section IV evaluates the bounds for some specific examples.
Section V concludes the paper with additional remarks. Proofs are given in the appendices.



II. Guessing with key-rate constraints and source compression 

In this section, we make a precise statement of our problem, and establish a connection between guessing 
and source compression subject to a new cost criterion. 

Let $\mathbb{X}^n$ denote the set of messages and $\mathcal{M}(\mathbb{X}^n)$ the set of PMFs on $\mathbb{X}^n$. By a source, we mean a
sequence of PMFs $(P_n : n \in \mathbb{N})$, where$^2$ $P_n \in \mathcal{M}(\mathbb{X}^n)$. Let $X^n$ denote a message put out by the source
and $U^k$ the secure key of $k$ purely random bits independent of $X^n$. Recall that the transmitter computes
the cryptogram $Y = f_n(X^n, U^k)$ and sends it to the receiver over a public channel.

2 Sometimes we use $P_{X^n}$ in place of $P_n$ when we refer to the distribution of the random vector $X^n$.

For a given cryptogram $Y = y$, define a guessing strategy

$$G_n(\cdot \mid y) : \mathbb{X}^n \to \{1, 2, \ldots, |\mathbb{X}|^n\}$$

as a bijection that denotes the order in which elements of $\mathbb{X}^n$ are guessed. $G_n(x^n \mid y) = l$ indicates
that $x^n$ is the $l$th guess when the cryptogram is $y$. With knowledge of $P_n$, the encryption function $f_n$,
and the cryptogram $Y$, the attacker can exhaustively calculate the posterior probabilities
$P_{X^n|Y}(\cdot \mid y)$ of all plaintexts given the cryptogram. The attacker's optimal guessing strategy is then to guess in
decreasing order of these posterior probabilities $P_{X^n|Y}(\cdot \mid y)$. Let us denote this optimal attack strategy
by $G_{f_n}$. The key rate for the system is $R = k(\ln 2)/n$ nats of secrecy per source letter. Let $(f_n : n \in \mathbb{N})$
denote the sequence of encryption functions, where $\mathbb{N}$ denotes the set of natural numbers. This sequence is
known to the attacker. We assume that the attacker employs the aforementioned optimal guessing strategy.
For a given $\rho > 0$ and key rate $R > 0$, define the normalized guessing exponent

$$E_n^g(R,\rho) := \sup_{f_n} \frac{1}{n} \ln \mathbb{E}\left[ G_{f_n}(X^n \mid Y)^{\rho} \right].$$

The supremum is taken over all encryption functions. Further define the performance limits of guessing
moments as in [2]:

$$E_u^g(R,\rho) := \limsup_{n \to \infty} E_n^g(R,\rho) \tag{2}$$

$$E_l^g(R,\rho) := \liminf_{n \to \infty} E_n^g(R,\rho). \tag{3}$$

We next define the related compression quantities. A length function $L_n : \mathbb{X}^n \to \mathbb{N}$ is a mapping that
satisfies Kraft's inequality:

$$\sum_{x^n \in \mathbb{X}^n} \exp_2\{-L_n(x^n)\} \le 1,$$

where the code alphabet is taken to be binary and $\exp_2\{a\} = 2^a$. (We shall use $\exp$ to denote the inverse of
the natural logarithm $\ln$.) Every length function yields an attack strategy with a performance characterized
as follows.

Proposition 1: Let $L_n$ be any length function on $\mathbb{X}^n$. There is a guessing list $G_n$ such that for any
encryption function $f_n$, we have$^3$

$$G_n(x^n \mid y) \le 2 \exp_2\{\min\{L_n(x^n),\, nR/(\ln 2)\}\} = 2 \exp\{\min\{L_n(x^n) \ln 2,\, nR\}\}.$$

Proof: We use a technique of Merhav & Arikan [2]. Let $G_{L_n}$ denote the guessing function that
ignores the cryptogram and proceeds in increasing order of $L_n$ lengths. Suppose $G_{L_n}$ proceeds in
the order $x_1^n, x_2^n, \ldots$. By [8, Prop. 2], we need at most $\exp_2\{L_n(x^n)\}$ guesses to identify $x^n$. (This is a
simple consequence of the fact that there are at most $\exp_2\{L_n(x^n)\}$ strings of length less than or equal
to $L_n(x^n)$.)

As an alternative attack, consider the exhaustive key-search attack defined by the following guessing list:

$$f_n^{-1}(y, u_1^k),\ f_n^{-1}(y, u_2^k),\ \ldots$$

where $u_1^k, u_2^k, \ldots$ is an arbitrary ordering of the keys. This strategy identifies $x^n$ in at most $\exp\{nR\} =
\exp_2\{nR/(\ln 2)\}$ guesses. Finally, let $G_n(\cdot \mid y)$ be the list that alternates between the two lists, skipping
those already guessed, i.e., the one that proceeds in the order

$$x_1^n,\ f_n^{-1}(y, u_1^k),\ x_2^n,\ f_n^{-1}(y, u_2^k),\ \ldots \tag{4}$$

Clearly, for every $x^n$, we need at most twice the minimum over the two individual lists. ■
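The interleaving argument in the proof can be illustrated with a small sketch. The toy cipher below (3-bit messages, one key bit, encryption by XOR with an all-ones or all-zeros mask) and the stand-in ordering for the length-based list are hypothetical choices made only for illustration; they are not taken from the paper.

```python
from itertools import zip_longest

def interleave(length_ordered, key_search):
    """Alternate between two guess lists, skipping guesses already made."""
    seen, merged = set(), []
    for pair in zip_longest(length_ordered, key_search):
        for guess in pair:
            if guess is not None and guess not in seen:
                seen.add(guess)
                merged.append(guess)
    return merged

def rank(guess_list, message):
    """1-based position of a message in a guess list."""
    return guess_list.index(message) + 1

def decrypt(y, u):
    """Inverse of the toy encryption f(x, u) = x XOR (7 * u)."""
    return y ^ (7 * u)

def worst_ratio():
    """Largest ratio of interleaved rank to the better of the two single-list ranks."""
    length_ordered = list(range(8))  # stand-in for the increasing-L_n order
    worst = 0.0
    for x in range(8):
        for u in (0, 1):
            y = x ^ (7 * u)  # cryptogram seen by the attacker
            key_search = [decrypt(y, v) for v in (0, 1)]
            merged = interleave(length_ordered, key_search)
            best_single = min(rank(length_ordered, x), rank(key_search, x))
            worst = max(worst, rank(merged, x) / best_single)
    return worst
```

In this toy setting the ratio never exceeds 2, which is the factor appearing in Proposition 1.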



3 We reiterate that $R$ is measured in nats.

We now look at a weak converse in the expected sense to the above. We first state without proof the
following lemma, which associates a length function to any guessing function (see [8, Prop. 1]).
Lemma 2: Given a guessing function $G_n$, there exists a length function $L_{G_n}$ satisfying

$$L_{G_n}(x^n) - 1 - \log_2 c_n \le \log_2 G_n(x^n) \le L_{G_n}(x^n), \tag{5}$$

where

$$c_n = \sum_{i=1}^{|\mathbb{X}|^n} \frac{1}{i}.$$

For a proof, we refer the reader to [8, Prop. 1]. We then have the following proposition.
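One way to see why such a length function exists is the assignment $L(x) = \lceil \log_2 (c\, G(x)) \rceil$, where $c$ is the harmonic sum over all guess ranks. The sketch below assumes this particular construction (it is a standard one, but it is not claimed to be the exact argument of the cited reference) and checks Kraft's inequality and both sides of (5) on a small list.

```python
import math

def harmonic(m):
    """c = sum_{i=1}^{m} 1/i, the normalizing constant for m guess ranks."""
    return sum(1.0 / i for i in range(1, m + 1))

def length_from_rank(g, c):
    """Assumed construction: codeword length ceil(log2(c * g)) for guess rank g."""
    return math.ceil(math.log2(c * g))

def check_lemma(m):
    """Return (Kraft sum, whether both inequalities in (5) hold for every rank)."""
    c = harmonic(m)
    kraft, ok = 0.0, True
    for g in range(1, m + 1):
        L = length_from_rank(g, c)
        kraft += 2.0 ** (-L)
        ok = ok and (L - 1 - math.log2(c) <= math.log2(g) <= L)
    return kraft, ok
```

Since the harmonic sum over $m$ terms is at most $1 + \ln m$, taking $m = |\mathbb{X}|^n$ gives a constant that grows only logarithmically in the message-space size.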
Proposition 3: Fix $n \in \mathbb{N}$ and $\rho > 0$. There is an encryption function $f_n$ and a length function $L_n$ such
that every guessing strategy $G_n$ (and in particular $G_{f_n}$) satisfies

$$\mathbb{E}\left[ G_n(X^n \mid Y)^{\rho} \right] \ge \frac{1}{(4 c_n)^{\rho} (2 + \rho)}\, \mathbb{E}\left[ \exp\{\rho \min\{L_n(X^n) \ln 2,\, nR\}\} \right].$$

Proof: See Appendix A. The proof is an extension of Merhav & Arikan's proof of [2, Th. 1] to
sources with memory. The idea is to identify an encryption mechanism that maps messages of roughly
equal probability to each other. Our proof also suggests an asymptotically optimal encryption strategy for
sources with memory. ■
Remark 1: Note that $c_n \le 1 + n \ln |\mathbb{X}|$, so that

$$\frac{\ln c_n}{n} = O\left( \frac{\ln n}{n} \right), \tag{6}$$

a fact that will be put to good use in the sequel. □
Propositions 1 and 3 naturally suggest the following coding problem: identify

$$E_n^s(R,\rho) := \min_{L_n} \frac{1}{n} \ln \mathbb{E}\left[ \exp\{\rho \min\{L_n(X^n) \ln 2,\, nR\}\} \right]. \tag{7}$$

The minimum is taken over all length functions. We may interpret the cost of using length $L_n(x^n)$ as
$\exp\{\min\{L_n(x^n) \ln 2,\, nR\}\}$, i.e., the cost is exponential in $L_n$ but saturates at $\exp\{nR\}$, so all
lengths larger than $nR$ nats (i.e., $nR/(\ln 2)$ bits) incur the saturated cost. Then $E_n^s(R,\rho)$ is the minimum
normalized exponent of the $\rho$th moment of this new compression cost. In analogy with (2) and (3) we
define

$$E_u^s(R,\rho) := \limsup_{n \to \infty} E_n^s(R,\rho)$$

$$E_l^s(R,\rho) := \liminf_{n \to \infty} E_n^s(R,\rho).$$

The following is a corollary to Propositions 1 and 3 and relates $E_n^g(R,\rho)$ and $E_n^s(R,\rho)$.
Corollary 4: For a given $R, \rho > 0$, we have

$$\left| E_n^g(R,\rho) - E_n^s(R,\rho) \right| \le \frac{\ln\left( (4 c_n)^{\rho} (2 + \rho) \right)}{n}. \tag{8}$$






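Proof: Let $L_n^*$ be the length function that achieves $E_n^s(R,\rho)$. Using Proposition 1 and after taking
expectation, we have a guessing strategy $G_n$ that satisfies

\begin{align}
\mathbb{E}\left[ \exp\{\rho \min\{L_n^*(X^n) \ln 2,\, nR\}\} \right]
&\ge \sup_{f_n} \frac{1}{2^{\rho}}\, \mathbb{E}\left[ G_n(X^n \mid Y)^{\rho} \right] \notag \\
&\ge \sup_{f_n} \frac{1}{2^{\rho}}\, \mathbb{E}\left[ G_{f_n}(X^n \mid Y)^{\rho} \right] \notag \\
&\ge \frac{1}{2^{\rho}} \cdot \frac{1}{(4 c_n)^{\rho} (2 + \rho)}\, \mathbb{E}\left[ \exp\{\rho \min\{L_n(X^n) \ln 2,\, nR\}\} \right] \notag \\
&\ge \frac{1}{2^{\rho}} \cdot \frac{1}{(4 c_n)^{\rho} (2 + \rho)}\, \mathbb{E}\left[ \exp\{\rho \min\{L_n^*(X^n) \ln 2,\, nR\}\} \right], \notag
\end{align}

where the third inequality holds for some $f_n$ and $L_n$ given by Proposition 3, and the last holds because
$L_n^*$ attains the minimum in (7). Take logarithms, normalize by $n$, and use $c_n \ge 1$ and $\rho > 0$ to get (8). ■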
We now state the equivalence between compression and guessing.

Theorem 5 (Guessing-Compression Equivalence): For any $\rho > 0$ and $R > 0$, we have $E_u^g(R,\rho) =
E_u^s(R,\rho)$ and $E_l^g(R,\rho) = E_l^s(R,\rho)$.

Proof: From Corollary 4 and (6), the magnitude of the difference between $E_n^g(R,\rho)$ and $E_n^s(R,\rho)$ decays
as $O((\ln n)/n)$ and vanishes as $n \to \infty$. ■

Thus, the problem of finding the optimal guessing exponent is the same as that of finding the optimal
exponent for the coding problem in (7). When $R > \ln |\mathbb{X}|$, the coding problem in (7) reduces to the
one considered by Campbell in [7]; this is a case where perfect secrecy is obtained and is studied in [8].
Proposition 1 shows that the optimal length function attaining the minimum in (7) yields an asymptotically
optimal attack strategy on the cipher system. Moreover, the encryption strategy in the proof of Proposition
3 (see Appendix A) is asymptotically optimal from the designer's point of view.

In the rest of the paper we focus on the equivalent compression problem and find bounds on $E_u^s$ and
$E_l^s$.


III. Growth Exponent for the Modified Compression Problem 



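We begin with some words on notation. Recall that $\mathcal{M}(\mathbb{X}^n)$ denotes the set of PMFs on $\mathbb{X}^n$. The
Shannon entropy for a $P_n \in \mathcal{M}(\mathbb{X}^n)$ is

$$H(P_n) = - \sum_{x^n \in \mathbb{X}^n} P_n(x^n) \ln P_n(x^n)$$

and the Rényi entropy of order $\alpha \ne 1$ is

$$H_{\alpha}(P_n) = \frac{1}{1 - \alpha} \ln \left( \sum_{x^n \in \mathbb{X}^n} P_n^{\alpha}(x^n) \right). \tag{9}$$

The Kullback-Leibler divergence or relative entropy between two PMFs $Q_n$ and $P_n$ is

$$D(Q_n \,\|\, P_n) = \begin{cases} \displaystyle\sum_{x^n \in \mathbb{X}^n} Q_n(x^n) \ln \frac{Q_n(x^n)}{P_n(x^n)}, & \text{if } Q_n \ll P_n, \\ \infty, & \text{otherwise}, \end{cases}$$

where $Q_n \ll P_n$ means $Q_n$ is absolutely continuous with respect to $P_n$. We shall use $(X^n : n \in \mathbb{N})$
to denote a sequence of random variables on $\mathbb{X}^n$, with the corresponding sequence of probability measures
denoted by $\mathbf{X} := (P_{X^n} : n \in \mathbb{N})$. Thus $\mathbf{X}$ is a source and $X^n$ its $n$-letter message output. Abusing notation,
we let $\mathcal{M}(\mathbb{X}^{\mathbb{N}})$ denote the set of all sequences $\mathbf{Y} = (P_{Y^n} : n \in \mathbb{N})$ of probability measures, and for each
$\mathbf{B} := (B_n \subseteq \mathbb{X}^n : n \in \mathbb{N})$, we define

$$\mathcal{M}(\mathbf{B}) := \left\{ \mathbf{Y} \in \mathcal{M}(\mathbb{X}^{\mathbb{N}}) : \lim_{n \to \infty} P_{Y^n}(B_n) = 1 \right\}.$$

In the rest of this section $\mathbf{X}$ is a fixed source. For any $\mathbf{Y} \in \mathcal{M}(\mathbb{X}^{\mathbb{N}})$ and $\rho > 0$, define

$$E_u(\mathbf{Y}, \mathbf{X}, \rho) := \limsup_{n \to \infty} \frac{1}{n} \left\{ \rho H(P_{Y^n}) - D(P_{Y^n} \,\|\, P_{X^n}) \right\}$$

and

$$E_l(\mathbf{Y}, \mathbf{X}, \rho) := \liminf_{n \to \infty} \frac{1}{n} \left\{ \rho H(P_{Y^n}) - D(P_{Y^n} \,\|\, P_{X^n}) \right\}.$$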

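We next state a large deviation result that plays a key role in the derivation of bounds on $E_u^s$ and $E_l^s$.
Proposition 6: For all $\rho > 0$ and $\mathbf{B} = (B_n \subseteq \mathbb{X}^n : n \in \mathbb{N})$, we have

$$(1 + \rho) \limsup_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n) = \max_{\mathbf{Y} \in \mathcal{M}(\mathbf{B})} E_u(\mathbf{Y}, \mathbf{X}, \rho) \tag{10}$$

$$(1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n) = \max_{\mathbf{Y} \in \mathcal{M}(\mathbf{B})} E_l(\mathbf{Y}, \mathbf{X}, \rho). \tag{11}$$

The maximum-achieving distribution in (10) and (11) is the source $\mathbf{X}^* = (P^*_{X^n} : n \in \mathbb{N})$ given by

$$P^*_{X^n}(x^n) = \frac{P_{X^n}^{\frac{1}{1+\rho}}(x^n)}{\sum_{\bar{x}^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(\bar{x}^n)}, \quad x^n \in B_n. \tag{12}$$

Proof: See Appendix B. ■

Remark 2: This proposition is a generalization of Iriyama's [6, Prop. 1], which is obtained by setting
$\rho = 0$. □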
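The single-letter version of the identity behind Proposition 6 can be checked directly: over PMFs $Q$ supported on a set $B$, the quantity $\rho H(Q) - D(Q \,\|\, P)$ is maximized by the tilted distribution proportional to $P^{1/(1+\rho)}$ on $B$, with maximum value $(1+\rho) \ln \sum_{x \in B} P^{1/(1+\rho)}(x)$. The sketch below assumes a finite alphabet indexed by integers; the function names are illustrative.

```python
import math

def objective(q, p, rho):
    """rho * H(q) - D(q || p), in nats, for PMFs given as lists."""
    h = -sum(x * math.log(x) for x in q if x > 0)
    d = sum(x * math.log(x / y) for x, y in zip(q, p) if x > 0)
    return rho * h - d

def tilted(p, rho, subset):
    """Distribution proportional to p^(1/(1+rho)) on `subset`, zero elsewhere."""
    w = [p[i] ** (1.0 / (1.0 + rho)) if i in subset else 0.0 for i in range(len(p))]
    s = sum(w)
    return [x / s for x in w]

def closed_form(p, rho, subset):
    """(1 + rho) * ln sum_{x in subset} p(x)^(1/(1+rho))."""
    return (1.0 + rho) * math.log(sum(p[i] ** (1.0 / (1.0 + rho)) for i in subset))
```

Evaluating `objective` at `tilted(p, rho, subset)` reproduces `closed_form(p, rho, subset)`, and any other PMF on the same subset gives a value that is no larger.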

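A. Upper Bound on $E_u^s$

We first obtain an upper bound on $E_u^s$. We use $\mathbb{E}_{X^n}[\cdot]$ to denote the expectation with respect to the
distribution $P_{X^n}$.

Proposition 7 (Upper Bound): Let $R > 0$ and $\rho > 0$. Then

$$E_u^s(R,\rho) \le \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) R + \max_{\mathbf{Y} \in \mathcal{M}(\mathbb{X}^{\mathbb{N}})} E_u(\mathbf{Y}, \mathbf{X}, \theta) \right\}.$$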

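Proof: We first recall the useful variational formula [9, Prop. 1.4.2]

$$\ln \mathbb{E}_{X^n}\left[ \exp\{U(X^n)\} \right] = \sup_{P_{Y^n}} \left\{ \mathbb{E}_{Y^n}[U(Y^n)] - D(P_{Y^n} \,\|\, P_{X^n}) \right\} \tag{13}$$

for any $U : \mathbb{X}^n \to \mathbb{R}$, where $\mathbb{R}$ denotes the set of real numbers. For notational convenience, let $d(Y^n) :=
D(P_{Y^n} \,\|\, P_{X^n})$. Observe that

\begin{align}
\ln \mathbb{E}_{X^n}&\left[ \exp\{\rho \min\{L_n(X^n) \ln 2,\, nR\}\} \right] \notag \\
&= \sup_{P_{Y^n}} \left[ \rho\, \mathbb{E}_{Y^n}\left[ \min\{L_n(Y^n) \ln 2,\, nR\} \right] - d(Y^n) \right] \tag{14} \\
&\le \sup_{P_{Y^n}} \left[ \rho \min\left\{ \mathbb{E}_{Y^n}[L_n(Y^n) \ln 2],\, nR \right\} - d(Y^n) \right] \tag{15} \\
&= \sup_{P_{Y^n}} \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) nR + \theta\, \mathbb{E}_{Y^n}[L_n(Y^n) \ln 2] - d(Y^n) \right\} \tag{16} \\
&= \min_{0 \le \theta \le \rho} \sup_{P_{Y^n}} \left\{ (\rho - \theta) nR + \theta\, \mathbb{E}_{Y^n}[L_n(Y^n) \ln 2] - d(Y^n) \right\} \tag{17} \\
&= \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) nR + \sup_{P_{Y^n}} \left\{ \theta\, \mathbb{E}_{Y^n}[L_n(Y^n) \ln 2] - d(Y^n) \right\} \right\}. \notag
\end{align}

In the above sequence of inequalities, (14) follows from the variational formula (13) with

$$U(x^n) = \rho \min\{L_n(x^n) \ln 2,\, nR\}.$$

Inequality (15) follows from Jensen's inequality because $\min\{\cdot, nR\}$ is concave for a fixed $nR$. Equality
(16) follows from the identity

$$\rho \min\{a, b\} = \min_{0 \le \theta \le \rho} \{\theta a + (\rho - \theta) b\}.$$

Equality (17) follows because the term within braces is linear in $\theta$ for a fixed $P_{Y^n}$, concave in $P_{Y^n}$ for a
fixed $\theta$, and the sets $[0, \rho]$ and $\mathcal{M}(\mathbb{X}^n)$ are compact and convex; these permit an interchange of sup and
min, thanks to a minimax theorem [10, Cor. 2, p. 53]. Taking inf over $L_n$, and interchanging the inf over
$L_n$ and the min over $\theta$, we get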

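\begin{align}
\inf_{L_n} \ln \mathbb{E}_{X^n}&\left[ \exp\{\rho \min\{L_n(X^n) \ln 2,\, nR\}\} \right] \notag \\
&\le \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) nR + \inf_{L_n} \sup_{P_{Y^n}} \left\{ \theta\, \mathbb{E}_{Y^n}[L_n(Y^n) \ln 2] - d(Y^n) \right\} \right\} \notag \\
&= \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) nR + \sup_{P_{Y^n}} \left\{ \theta \inf_{L_n} \mathbb{E}_{Y^n}[L_n(Y^n) \ln 2] - d(Y^n) \right\} \right\} + O(1) \tag{18} \\
&\le \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) nR + \sup_{P_{Y^n}} \left\{ \theta H(P_{Y^n}) - d(Y^n) \right\} \right\} + O(1) \tag{19} \\
&= \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) nR + \theta H_{\frac{1}{1+\theta}}(P_{X^n}) \right\} + O(1). \tag{20}
\end{align}

Equality (18) follows because the function inside the inner braces is concave in $P_{Y^n}$, asymptotically
linear in $L_n$ (see the proof of [8, Prop. 6]), and $\mathcal{M}(\mathbb{X}^n)$ is compact; this allows us to interchange inf and
sup. Inequality (19) follows because the inf of expected compression lengths over all prefix codes is within
$\ln 2$ nats (1 bit) of entropy. The last equality follows from the well-known variational characterization of
Rényi entropy,

$$\sup_{P_{Y^n}} \left\{ \theta H(P_{Y^n}) - D(P_{Y^n} \,\|\, P_{X^n}) \right\} = \theta H_{\frac{1}{1+\theta}}(P_{X^n}), \tag{21}$$

a fact that can also be gleaned from the variational formula (13). Divide both sides of (20) by $n$ and take
the limit supremum as $n \to \infty$ to get

\begin{align}
E_u^s(R,\rho)
&\le \limsup_{n \to \infty} \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) R + \frac{\theta}{n} H_{\frac{1}{1+\theta}}(P_{X^n}) \right\} \notag \\
&\le \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) R + \theta \limsup_{n \to \infty} \frac{1}{n} H_{\frac{1}{1+\theta}}(P_{X^n}) \right\} \notag \\
&= \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) R + \max_{\mathbf{Y} \in \mathcal{M}(\mathbb{X}^{\mathbb{N}})} E_u(\mathbf{Y}, \mathbf{X}, \theta) \right\}, \notag
\end{align}

where the last equality follows from Proposition 6 and the formula for Rényi entropy. This completes
the proof. ■
From the above proof it is clear that the upper bound holds with equality when Jensen's inequality
holds with equality in (15), i.e., when the random variable $(1/n) \min\{L_n(X^n) \ln 2,\, nR\}$ tends asymptotically to
a constant. This would happen, for example, when normalized encoded lengths concentrate around the
entropy rate of the source.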



B. Lower Bound on $E_l^s$

We now derive a lower bound on $E_l^s$. For a given distribution $P_{Y^n}$, arrange the elements of the set $\mathbb{X}^n$ in
decreasing order of their $P_{Y^n}$-probabilities, as done in Sundaresan [3, Sec. IV]. Enumerate the sequences
from 1 to $|\mathbb{X}|^n$. Henceforth refer to a message by its index. Let $T_R(Y^n)$ denote the first $M = \lfloor \exp\{nR\} \rfloor$
elements in the list. We denote the probability of this set by $F_{Y^n}$, i.e.,

$$F_{Y^n} = \sum_{x^n \in T_R(Y^n)} P_{Y^n}(x^n),$$

and the probability of the complement $T_R^c(Y^n)$ of this set by $F^c_{Y^n}$. Let $P'_{Y^n}$ denote the conditional
distribution of $P_{Y^n}$ given $T_R(Y^n)$, i.e., its normalized restriction to this set. Let $L_n^*$ denote the length
function that attains $E_n^s(R,\rho)$ in (7). As the length functions
are uniquely decipherable, we have $\exp_2\{L_n^*(i)\} \ge i$.
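The set $T_R$ and the two probabilities $F$ and $F^c$ can be computed by brute force for a short iid source. This is only a sketch for small $n$ (the enumeration is exponential in $n$), and the helper names are hypothetical.

```python
import math
from itertools import product

def iid_pmf(p1, n):
    """PMF of an iid source on n-tuples; p1 maps each letter to its probability."""
    return {xs: math.prod(p1[x] for x in xs) for xs in product(p1, repeat=n)}

def top_set(pmf, R, n):
    """First M = floor(exp(n R)) sequences in decreasing order of probability."""
    M = math.floor(math.exp(n * R))
    ranked = sorted(pmf, key=pmf.get, reverse=True)
    return ranked[:M]

def split_mass(pmf, R, n):
    """Return (F, F^c): probability of the top set and of its complement."""
    top = set(top_set(pmf, R, n))
    F = sum(prob for xs, prob in pmf.items() if xs in top)
    Fc = sum(prob for xs, prob in pmf.items() if xs not in top)
    return F, Fc
```

For example, with `p1 = {0: 0.8, 1: 0.2}`, `n = 4`, and `R = 0.3`, the list keeps $M = 3$ sequences: the all-zeros sequence and two of the four sequences containing a single 1.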

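Proposition 8 (Lower Bound): For a given $\rho > 0$ and rate $R > 0$, we have

$$E_l^s(R,\rho) \ge \max\left\{ \rho R + \liminf_{n \to \infty} \frac{1}{n} \ln F^c_{X^n},\ (1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in T_R(X^n)} P_{X^n}^{\frac{1}{1+\rho}}(x^n) \right\}. \tag{22}$$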

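Remark 3: The first term contains the limit infimum of the error exponent for a rate-$R$ source code. The
second exponent is the correct decoding exponent for a rate-$R$ code when $\rho \downarrow 0$. □
Proof: The variational formula (13) applied to the function $U(x^n) = \rho \min\{L_n(x^n) \ln 2,\, nR\}$ gives

\begin{align}
\inf_{L_n} \ln \mathbb{E}_{X^n}\left[ \exp\{\rho \min\{L_n(X^n) \ln 2,\, nR\}\} \right]
&= \inf_{L_n} \sup_{P_{Y^n}} \left\{ \rho\, \mathbb{E}_{Y^n}\left[ \min\{L_n(Y^n) \ln 2,\, nR\} \right] - d(Y^n) \right\} \notag \\
&\ge \sup_{P_{Y^n}} \left\{ \rho \inf_{L_n} \mathbb{E}_{Y^n}\left[ \min\{L_n(Y^n) \ln 2,\, nR\} \right] - d(Y^n) \right\}, \tag{23}
\end{align}

where the interchange of inf and sup yields the lower bound in (23). Fix a distribution $P_{Y^n}$ and consider
the first term in (23). Using the enumeration indicated above, we may write

\begin{align}
\inf_{L_n} \mathbb{E}_{Y^n}&\left[ \min\{L_n(Y^n) \ln 2,\, nR\} \right] \notag \\
&= \sum_{i=1}^{|\mathbb{X}|^n} P_{Y^n}(i) \min\{L_n^*(i) \ln 2,\, nR\} \notag \\
&= \sum_{i=1}^{M} P_{Y^n}(i) \min\{L_n^*(i) \ln 2,\, nR\} + \sum_{i=M+1}^{|\mathbb{X}|^n} P_{Y^n}(i)\, nR \notag \\
&\ge \sum_{i=1}^{M} P_{Y^n}(i) \ln G_n^*(i) + nR F^c_{Y^n} \tag{24} \\
&\ge F_{Y^n} \sum_{i=1}^{M} \frac{P_{Y^n}(i)}{F_{Y^n}} L_{G_n^*}(i) \ln 2 - \ln 2 - \ln(1 + n \ln |\mathbb{X}|) + nR F^c_{Y^n} \tag{25} \\
&\ge F_{Y^n} H(P'_{Y^n}) - \ln 2 - \ln(1 + n \ln |\mathbb{X}|) + nR F^c_{Y^n}. \tag{26}
\end{align}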

Inequality (24) follows because

$$L_n^*(i) \ln 2 \ge \ln i = \ln G_n^*(i)$$

with $G_n^*$ the guessing strategy that guesses in decreasing order of $P_{Y^n}$-probabilities. $L_{G_n^*}$ in (25) denotes
the length function given by Lemma 2. Inequality (26) follows from the source coding theorem's lower
bound. Substitute (26) in (23), normalize by $n$, and take the limit infimum to get

$$E_l^s(R,\rho) \ge \liminf_{n \to \infty} \frac{1}{n} \sup_{P_{Y^n}} \left\{ \rho F_{Y^n} H(P'_{Y^n}) + F^c_{Y^n} \rho nR - d(Y^n) \right\}.$$

$P_{Y^n}$ may be thought of as a triplet made of $F_{Y^n}$, $P'_{Y^n}$, and the restriction of $P_{Y^n}$ to $T_R^c(Y^n)$. We now
perform the optimization

$$\sup_{P_{Y^n}} \left\{ \rho F_{Y^n} H(P'_{Y^n}) + F^c_{Y^n} \rho nR - d(Y^n) \right\} \tag{27}$$

in four steps.

Step 1: We first optimize over permutations of probabilities over strings. $F_{Y^n}$, $F^c_{Y^n}$, $H(P_{Y^n})$, and $H(P'_{Y^n})$
remain unchanged over these permutations. Observe that

$$-d(Y^n) = H(P_{Y^n}) + \sum_{y^n} P_{Y^n}(y^n) \ln P_{X^n}(y^n),$$

and so the maximum for $-d(Y^n)$ is attained when the permutation that orders $P_{X^n}(\cdot)$ in decreasing order
also orders $P_{Y^n}(\cdot)$ in decreasing order. In particular, $T_R(Y^n)$ equals $T_R(X^n)$.

Step 2: We now optimize over the restriction of $P_{Y^n}$ to $T_R^c(Y^n)$. For a fixed $F^c_{Y^n}$, the log-sum inequality
yields

$$\sum_{x^n \in T_R^c(X^n)} P_{Y^n}(x^n) \ln \frac{P_{Y^n}(x^n)}{P_{X^n}(x^n)} \ge F^c_{Y^n} \ln \frac{F^c_{Y^n}}{F^c_{X^n}},$$

with equality if and only if $P_{Y^n}(x^n) = P_{X^n}(x^n)\, F^c_{Y^n} / F^c_{X^n}$ for all $x^n \in T_R^c(X^n)$.
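The rearrangement claim of Step 1 can be confirmed by brute force on a small example: among all assignments of a fixed multiset of $P_{Y^n}$ values to strings, the cross term $\sum_{y} P_{Y^n}(y) \ln P_{X^n}(y)$ is largest when the two PMFs are ordered alike. The sketch below is illustrative only and enumerates all permutations, so it is practical only for very small alphabets.

```python
import math
from itertools import permutations

def cross_term(q, p):
    """sum_y q(y) * ln p(y), the only part of -d that depends on the ordering."""
    return sum(a * math.log(b) for a, b in zip(q, p))

def best_permutation(q_values, p):
    """Assignment of q_values to positions that maximizes the cross term."""
    return max(permutations(q_values), key=lambda q: cross_term(q, p))
```

With `p` listed in decreasing order, `best_permutation` returns the `q_values` sorted in decreasing order as well.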
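Step 3: To optimize over $P'_{Y^n}$, rewrite (27) as

\begin{align}
\sup_{P_{Y^n}} &\left\{ \rho F_{Y^n} H(P'_{Y^n}) + F^c_{Y^n} \rho nR - \sum_{i=1}^{M} P_{Y^n}(i) \ln \frac{P_{Y^n}(i)}{P_{X^n}(i)} - \sum_{i=M+1}^{|\mathbb{X}|^n} P_{Y^n}(i) \ln \frac{P_{Y^n}(i)}{P_{X^n}(i)} \right\} \notag \\
&= \sup_{P'_{Y^n},\, F_{Y^n}} \left\{ \rho F_{Y^n} H(P'_{Y^n}) + F^c_{Y^n} \rho nR - \sum_{i=1}^{M} P_{Y^n}(i) \ln \frac{P_{Y^n}(i)}{P_{X^n}(i)} - F^c_{Y^n} \ln \frac{F^c_{Y^n}}{F^c_{X^n}} \right\} \tag{28} \\
&= \sup_{P'_{Y^n},\, F_{Y^n}} \left\{ \rho F_{Y^n} H(P'_{Y^n}) + F^c_{Y^n} \rho nR - F_{Y^n} D(P'_{Y^n} \,\|\, P'_{X^n}) - D(F_{Y^n} \,\|\, F_{X^n}) \right\} \tag{29} \\
&= \sup_{F_{Y^n}} \left\{ \rho F_{Y^n} H_{\frac{1}{1+\rho}}(P'_{X^n}) + F^c_{Y^n} \rho nR - D(F_{Y^n} \,\|\, F_{X^n}) \right\}. \tag{30}
\end{align}

Equality (28) is obtained by substituting the attained lower bound in Step 2. In (29), $P'_{Y^n}$ and $P'_{X^n}$ denote
the conditional distributions of $P_{Y^n}$ and $P_{X^n}$ given $T_R(Y^n)$ and $T_R(X^n)$, respectively, where $T_R(Y^n) =
T_R(X^n)$ as argued in Step 1. $D(F_{Y^n} \,\|\, F_{X^n})$ denotes the divergence between binary random variables
whose probabilities are $\{F_{Y^n}, 1 - F_{Y^n}\}$ and $\{F_{X^n}, 1 - F_{X^n}\}$, respectively. Finally, we used the variational
characterization of Rényi entropy given in (21) to arrive at (30).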

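Step 4: We now optimize over $F_{Y^n} \in [0, 1]$. Let $Z$ be a binary random variable defined as

$$Z = \begin{cases} \rho H_{\frac{1}{1+\rho}}(P'_{X^n}) & \text{with probability } F_{Y^n}, \\ \rho nR & \text{with probability } 1 - F_{Y^n}. \end{cases}$$

By $\mathbb{E}_{F_{Y^n}}[Z]$ we mean the expectation of $Z$ with respect to the above distribution. Since $Z$ is a positive
random variable, the variational formula yields

$$\sup_{F_{Y^n}} \left\{ \mathbb{E}_{F_{Y^n}}[Z] - D(F_{Y^n} \,\|\, F_{X^n}) \right\} = \ln \mathbb{E}_{F_{X^n}}\left[ \exp\{Z\} \right].$$

Continuing with the chain of equalities from (30) we get

\begin{align}
\sup_{F_{Y^n}} &\left\{ F_{Y^n} \rho H_{\frac{1}{1+\rho}}(P'_{X^n}) + F^c_{Y^n} \rho nR - D(F_{Y^n} \,\|\, F_{X^n}) \right\} \notag \\
&= \ln \left\{ F^c_{X^n} \exp\{nR\rho\} + F_{X^n} \left( \sum_{i=1}^{M} {P'_{X^n}}^{\frac{1}{1+\rho}}(i) \right)^{1+\rho} \right\} \notag \\
&= \ln \left\{ F^c_{X^n} \exp\{nR\rho\} + \left( \sum_{i=1}^{M} P_{X^n}^{\frac{1}{1+\rho}}(i) \right)^{1+\rho} \right\}. \tag{31}
\end{align}

Finally normalize both sides of (31) by $n$, take the limit infimum, and apply [11, Lemma 1.2.15], which states
that the exponential rate of a sum is governed by the maximum of the individual terms' exponential rates,
to get the desired result. ■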
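The binary variational step used in Step 4 can be sanity-checked numerically. In the sketch below, `a` and `b` stand for the two values taken by $Z$ and `F` for $F_{X^n}$; the supremum over $F_{Y^n}$ is approximated by a grid search, which is an assumption of this illustration rather than part of the proof.

```python
import math

def binary_kl(f, F):
    """Divergence between Bernoulli-type pairs {f, 1-f} and {F, 1-F}, 0 < F < 1."""
    total = 0.0
    for a, b in ((f, F), (1.0 - f, 1.0 - F)):
        if a > 0:
            total += a * math.log(a / b)
    return total

def sup_on_grid(a, b, F, grid=20000):
    """Grid approximation of sup_f { f*a + (1-f)*b - D(f || F) } over f in [0, 1]."""
    return max(
        (i / grid) * a + (1.0 - i / grid) * b - binary_kl(i / grid, F)
        for i in range(grid + 1)
    )

def log_moment(a, b, F):
    """Closed form ln( F*exp(a) + (1-F)*exp(b) )."""
    return math.log(F * math.exp(a) + (1.0 - F) * math.exp(b))
```

The grid value never exceeds the closed form and approaches it as the grid is refined.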

In the subsequent subsections we further lower bound each of the two terms under the max on the right-hand
side of (22). For an arbitrary source we first recall the source coding error exponent. We also identify
the growth rate of the sum of exponentiated probabilities of the correct decoding set. We then relate these to
the terms in the lower bound obtained in (22). We largely follow the approach and notation of Iriyama
[6], which we now describe.

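Given $\mathbf{X} = (P_{X^n} : n \in \mathbb{N})$ and $\mathbf{Y} = (P_{Y^n} : n \in \mathbb{N})$, we define the upper divergence $D_u(\cdot \,\|\, \cdot)$ and lower
divergence $D_l(\cdot \,\|\, \cdot)$ by

$$D_u(\mathbf{Y} \,\|\, \mathbf{X}) := \limsup_{n \to \infty} \frac{1}{n} D(P_{Y^n} \,\|\, P_{X^n})$$

$$D_l(\mathbf{Y} \,\|\, \mathbf{X}) := \liminf_{n \to \infty} \frac{1}{n} D(P_{Y^n} \,\|\, P_{X^n}).$$

For a $\mathbf{Y} = (P_{Y^n} : n \in \mathbb{N})$, denote the spectral sup-entropy-rate [5, Sec. II], [12] as

$$\overline{H}(\mathbf{Y}) := \inf\left\{ \theta : \lim_{n \to \infty} \Pr\left\{ \frac{1}{n} \ln \frac{1}{P_{Y^n}(Y^n)} > \theta \right\} = 0 \right\}$$

and the spectral inf-entropy-rate as

$$\underline{H}(\mathbf{Y}) := \sup\left\{ \theta : \lim_{n \to \infty} \Pr\left\{ \frac{1}{n} \ln \frac{1}{P_{Y^n}(Y^n)} < \theta \right\} = 0 \right\}.$$

Also define, as in [6, Sec. II], the following quantity which determines the performance under mismatched
compression:

$$R(\mathbf{Y}, \mathbf{X}) := \sup\left\{ \theta : \lim_{n \to \infty} \Pr\left\{ \frac{1}{n} \ln \frac{1}{P_{X^n}(Y^n)} < \theta \right\} = 0 \right\}.$$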
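For an iid source the spectral sup- and inf-entropy-rates both equal the Shannon entropy of the marginal, because the normalized self-information concentrates. The sketch below illustrates this for an iid Bernoulli source by computing, exactly via the binomial distribution, the probability that $(1/n) \ln (1/P_{Y^n}(Y^n))$ deviates from the entropy by more than `delta`. The Bernoulli model and the function name are assumptions made for illustration.

```python
import math

def spectrum_tail(p, n, delta):
    """Exact Pr{ |(1/n) ln(1/P(Y^n)) - H| > delta } for an iid Bernoulli(p) source."""
    H = -(p * math.log(p) + (1 - p) * math.log(1 - p))
    tail = 0.0
    for k in range(n + 1):  # k = number of ones in the sequence
        info = -(k * math.log(p) + (n - k) * math.log(1 - p)) / n
        if abs(info - H) > delta:
            # log of the binomial probability, computed in the log domain for stability
            log_pmf = (
                math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(p) + (n - k) * math.log(1 - p)
            )
            tail += math.exp(log_pmf)
    return tail
```

As $n$ grows the tail probability shrinks toward zero for any fixed `delta`, which is the behavior that the definitions of $\overline{H}$ and $\underline{H}$ capture.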



1) Decoding Error Exponent: In this subsection we recall the decoding error exponent for fixed-rate
encoding of an arbitrary source. We identify the first term in (22) as composed of the exponent of the minimum
probability of decoding error, and obtain a lower bound for it, or alternatively an upper bound on the
error exponent. This is made precise in the following definitions.

By an $(n, M_n, \varepsilon_n)$-code we mean an encoding mapping

$$\phi_n : \mathbb{X}^n \to \{1, 2, \ldots, M_n\}$$

and a decoding mapping

$$\psi_n : \{1, 2, \ldots, M_n\} \to \mathbb{X}^n$$

with probability of error $\varepsilon_n := \Pr\{\psi_n(\phi_n(X^n)) \ne X^n\}$. $R$ is $r$-achievable if for all $\eta > 0$ there exists a
sequence of $(n, M_n, \varepsilon_n)$-codes such that

$$\limsup_{n \to \infty} \frac{1}{n} \ln \frac{1}{\varepsilon_n} \ge r \tag{32}$$

$$\limsup_{n \to \infty} \frac{1}{n} \ln M_n \le R + \eta. \tag{33}$$

The infimum fixed-length coding rate for exponent $r$ is

$$R(r \mid \mathbf{X}) = \inf\{R : R \text{ is } r\text{-achievable}\}.$$

On the other hand, the supremum fixed-length coding exponent for rate $R$ is

$$E(R \mid \mathbf{X}) = \sup\{r : R \text{ is } r\text{-achievable}\}.$$

See Iriyama [6] and Han [12, Sec. 1.9] for a pessimistic definition for fixed-rate source coding, i.e., the
liminf in place of the limsup in (32). See also Iriyama & Ihara [13] for both the pessimistic and optimistic
definitions. These works obtained bounds on the infimum coding rate. In particular, Iriyama [6, Eqn. (13)] and
Iriyama & Ihara [13, Eqn. (12)] obtained lower bounds on the infimum coding rate $R(r \mid \mathbf{X})$ under the
optimistic definition, the definition of interest to us. We however work with the error exponent, and obtain
an upper bound on the supremum coding exponent. This suffices to lower bound the first term in (22).
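Clearly, $M_n = \lfloor \exp\{nR\} \rfloor$ satisfies (33), and with

$$r = \limsup_{n \to \infty} \frac{1}{n} \ln \frac{1}{F^c_{X^n}},$$

$R$ is $r$-achievable. It follows from the definition of $E(R \mid \mathbf{X})$ that

$$\limsup_{n \to \infty} \frac{1}{n} \ln \frac{1}{F^c_{X^n}} \le E(R \mid \mathbf{X})$$

so that

$$\liminf_{n \to \infty} \frac{1}{n} \ln F^c_{X^n} \ge -E(R \mid \mathbf{X}).$$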

The following proposition upper bounds the supremum coding exponent.
Proposition 9: For any rate $R > 0$,

$$E(R \mid \mathbf{X}) \le \inf_{\mathbf{Y}:\, \underline{H}(\mathbf{Y}) \ge R} D_u(\mathbf{Y} \,\|\, \mathbf{X}). \tag{34}$$

Proof: See Appendix C. ■
Remark 4: When $R > \ln |\mathbb{X}|$, the probability of decoding error $\varepsilon_n = 0$, so that $E(R \mid \mathbf{X}) = +\infty$. The
right-hand side is an infimum over an empty set and is $+\infty$ by convention, and the proposition holds for
such $R$ as well.

One can also show the alternative bound

$$E(R \mid \mathbf{X}) \le \inf_{\mathbf{Y}:\, R(\mathbf{Y}, \mathbf{X}) - D_u(\mathbf{Y} \| \mathbf{X}) \ge R} D_u(\mathbf{Y} \,\|\, \mathbf{X}). \tag{35}$$

See the end of Appendix C on how to prove this. This result would be the functional inverse of Iriyama's [6,
Eqn. (13)], while Proposition 9 is the functional inverse of Iriyama & Ihara's [13, Eqn. (12)]. Proposition
9, as we will soon see, provides a more natural extension of Merhav & Arikan's expression for $E(R,\rho)$
to general sources. □

2) Correct Decoding Exponent: We now study a generalization of the exponential rate for the probability
of correct decoding.

For a given $(n, M_n, \varepsilon_n)$-code, let

$$A_n := \{x^n \in \mathbb{X}^n : \psi_n(\phi_n(x^n)) = x^n\}$$

denote the set of correctly decoded sequences. For a given $\rho > 0$, $R$ is $(r, \rho)$-admissible if for every $\eta > 0$
there exists a sequence of $(n, M_n, \varepsilon_n)$-codes such that

$$(1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in A_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n) \ge r \tag{36}$$

$$\limsup_{n \to \infty} \frac{1}{n} \ln M_n \le R + \eta. \tag{37}$$

Unlike the exponent for the probability of error, here $r$ can be positive or negative. The infimum fixed-length
admissible rate for a given $r$ and $\rho > 0$ is

$$R^*(r, \rho \mid \mathbf{X}) = \inf\{R : R \text{ is } (r, \rho)\text{-admissible}\}.$$

It is easy to see that the set $\{R : R \text{ is } (r, \rho)\text{-admissible}\}$ is closed and so $R^*(r, \rho \mid \mathbf{X})$ is $(r, \rho)$-admissible.
The supremum fixed-length coding exponent for a given $R$ and $\rho$ is

$$E^*(R, \rho \mid \mathbf{X}) = \sup\{r : R \text{ is } (r, \rho)\text{-admissible}\}.$$

Remark 5: The choice of limit infimum in (36) makes the definition of admissibility pessimistic. For
$\rho \downarrow 0$, the above definitions reduce to the special case of the exponential rate for the probability of correct
decoding (see [12, Sec. 1.10]). □

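Clearly, $A_n$ should be $T_R(X^n)$ to maximize the left-hand side of (36), and hence

$$E^*(R, \rho \mid \mathbf{X}) = (1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in T_R(X^n)} P_{X^n}^{\frac{1}{1+\rho}}(x^n).$$

The following proposition gives an expression for $E^*(R, \rho \mid \mathbf{X})$ and generalizes [6, Thm. 4] to arbitrary
$\rho > 0$. En route to its derivation we find the expression for $R^*(r, \rho \mid \mathbf{X})$.
Proposition 10: For any $\rho > 0$, we have

$$R^*(r, \rho \mid \mathbf{X}) = \inf_{\mathbf{Y}:\, E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r} \overline{H}(\mathbf{Y}) \tag{38}$$

$$E^*(R, \rho \mid \mathbf{X}) = \sup_{\mathbf{Y}:\, \overline{H}(\mathbf{Y}) \le R} E_l(\mathbf{Y}, \mathbf{X}, \rho). \tag{39}$$

Proof: See Appendix D. ■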

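C. Summary of Bounds on $E_u^s$ and $E_l^s$

We now combine Propositions 7-10 of the previous subsections to obtain the main result of the paper.
Theorem 11: For a given $\rho > 0$ and $R > 0$,

\begin{align}
\max&\left\{ \rho R - \inf_{\mathbf{Y}:\, \underline{H}(\mathbf{Y}) \ge R} D_u(\mathbf{Y} \,\|\, \mathbf{X}),\ \sup_{\mathbf{Y}:\, \overline{H}(\mathbf{Y}) \le R} E_l(\mathbf{Y}, \mathbf{X}, \rho) \right\} \notag \\
&\le E_l^s(R,\rho) \le E_u^s(R,\rho) \notag \\
&\le \min_{0 \le \theta \le \rho} \left\{ (\rho - \theta) R + \max_{\mathbf{Y} \in \mathcal{M}(\mathbb{X}^{\mathbb{N}})} E_u(\mathbf{Y}, \mathbf{X}, \theta) \right\}. \tag{40}
\end{align}

Proof: The last inequality was proved in Proposition 7. Proposition 8 indicates that

\begin{align}
E_l^s(R,\rho)
&\ge \max\left\{ \rho R + \liminf_{n \to \infty} \frac{1}{n} \ln F^c_{X^n},\ (1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in T_R(X^n)} P_{X^n}^{\frac{1}{1+\rho}}(x^n) \right\} \notag \\
&\ge \max\left\{ \rho R - E(R \mid \mathbf{X}),\ E^*(R, \rho \mid \mathbf{X}) \right\} \tag{41} \\
&\ge \max\left\{ \rho R - \inf_{\mathbf{Y}:\, \underline{H}(\mathbf{Y}) \ge R} D_u(\mathbf{Y} \,\|\, \mathbf{X}),\ \sup_{\mathbf{Y}:\, \overline{H}(\mathbf{Y}) \le R} E_l(\mathbf{Y}, \mathbf{X}, \rho) \right\}, \tag{42}
\end{align}

where (41) follows from the bound $\liminf_{n \to \infty} \frac{1}{n} \ln F^c_{X^n} \ge -E(R \mid \mathbf{X})$ and the definition of $E^*(R, \rho \mid \mathbf{X})$, and (42) from
Propositions 9 and 10. ■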



IV. Examples 

In this section we evaluate the bounds for some examples where they are tight, and recover some known 
results. 

Example 1 (Perfect Secrecy): First consider the perfect secrecy case, for example, $R > \ln |\mathbb{X}|$. Because
of Remark 4 and because we may take $\theta = \rho$ in the upper bound in (40), the limiting exponential rate of
guessing moments simplifies to

$$\sup_{\mathbf{Y}} E_l(\mathbf{Y}, \mathbf{X}, \rho) \le E_l^s(R,\rho) \le E_u^s(R,\rho) \le \max_{\mathbf{Y}} E_u(\mathbf{Y}, \mathbf{X}, \rho).$$

On account of (11) in Proposition 6, the sup in the left-most term is achieved. From Proposition 6, the lower and
upper bounds are, respectively, $\rho$ times the liminf and limsup Rényi entropy rates of order $1/(1+\rho)$. In a related work we
proved in [8, Prop. 7] that whenever the information spectrum of the source satisfies the large deviation
property with rate function $I$, the Rényi entropy rate converges and the limiting guessing exponent equals the
Legendre-Fenchel dual of the scaled rate function $I_1(t) := (1 + \rho) I(t)$, i.e.,

$$E_u^s(R,\rho) = E_l^s(R,\rho) = \sup_{t} \{\rho t - I_1(t)\}.$$

In the next examples, we consider the case $R < \ln |\mathbb{X}|$.

Example 2 (An iid source): This example was first studied by Merhav & Arikan [2]. Recall that an
iid source is one for which $P_n(x^n) = \prod_{i=1}^{n} P_1(x_i)$, where $P_1$ denotes the marginal of $X_1$. We will now
evaluate each term in (40).

We first argue that

$$\inf_{\mathbf{Y}:\, \underline{H}(\mathbf{Y}) \ge R} D_u(\mathbf{Y} \,\|\, \mathbf{X}) = \inf_{P_Y:\, H(P_Y) \ge R} D(P_Y \,\|\, P_1). \tag{43}$$

To prove that the left-hand side in (43) is less than or equal to the right-hand side, let $P_Y \in \mathcal{M}(\mathbb{X})$ be
such that $H(P_Y) \ge R$. Construct an iid source $\mathbf{Y} = (P_{Y^n} : n \in \mathbb{N})$ such that $P_{Y_i} = P_Y$ for all $1 \le i \le n$.
The iid property easily implies that

$$D_u(\mathbf{Y} \,\|\, \mathbf{X}) = D(P_Y \,\|\, P_1),$$

and the law of large numbers for iid random variables yields

$$\underline{H}(\mathbf{Y}) = H(P_Y) \ge R. \tag{44}$$

From (44), we have that the infimum on the left-hand side of (43) is over a larger set. We can therefore
conclude that "$\le$" holds in (43).

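To prove "$\ge$" in (43) we use the result (see [12, Th. 1.7.2])

$$
\underline{H}(\mathbf{Y}) \le H_l(\mathbf{Y}) := \liminf_{n \to \infty} \frac{1}{n} H(P_{Y^n})
$$

to get that the infimum over a larger set is smaller, i.e.,

$$
\inf_{\mathbf{Y} : \underline{H}(\mathbf{Y}) > R} D_u(\mathbf{Y} \| \mathbf{X}) \ge \inf_{\mathbf{Y} : H_l(\mathbf{Y}) > R} D_u(\mathbf{Y} \| \mathbf{X}). \tag{45}
$$

In view of (45), to establish "$\ge$" in (43) it is sufficient to prove

$$
\inf_{\mathbf{Y} : H_l(\mathbf{Y}) > R} D_u(\mathbf{Y} \| \mathbf{X}) \ge \inf_{P_Y : H(P_Y) > R} D(P_Y \| P_1). \tag{46}
$$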

Let $\mathbf{Y}$ be such that $H_l(\mathbf{Y}) > R$. Construct a source $\tilde{\mathbf{Y}}$ such that $P_{\tilde{Y}_i} = P_{Y_i}$ for $1 \le i \le n$ and $\tilde{Y}_1, \tilde{Y}_2, \ldots, \tilde{Y}_n$ are independent. Let $\mathbf{Z}$ be another source such that $Z_1, Z_2, \ldots, Z_n$ is an iid sequence with distribution

$$
P_{Z_j} = \frac{1}{n} \sum_{i=1}^{n} P_{Y_i}, \quad j = 1, 2, \ldots, n.
$$

As the marginals of $Y^n$ and of $\tilde{Y}^n$, which has independent components, are the same, it easily follows from the formula for Kullback-Leibler divergence that

$$
\begin{aligned}
D(P_{Y^n} \| P_{X^n}) &= D(P_{Y^n} \| P_{\tilde{Y}^n}) + D(P_{\tilde{Y}^n} \| P_{X^n}) \\
&\ge D(P_{\tilde{Y}^n} \| P_{X^n}) \\
&= \sum_{i=1}^{n} D(P_{Y_i} \| P_1) \\
&\ge n D(P_{Z_1} \| P_1),
\end{aligned} \tag{47}
$$

where (47) follows from the convexity of divergence. From the concavity of Shannon entropy, we also have

$$
H(P_{Y^n}) \le \sum_{i=1}^{n} H(P_{Y_i}) \le n H(P_{Z_1}). \tag{48}
$$

Normalize by $n$, take limsup in (47) and liminf in (48) to get $D_u(\mathbf{Y} \| \mathbf{X}) \ge D(P_{Z_1} \| P_1)$ and $H(P_{Z_1}) > R$ for a $P_{Z_1}$ that is a limit point of the sequence $\big( \frac{1}{n} \sum_{i=1}^{n} P_{Y_i} : n \in \mathbb{N} \big)$. From these we conclude that (46) holds. This proves (43).
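The two averaging steps behind (47) and (48), convexity of divergence and concavity of entropy, can be sanity-checked on a small case. The following is a sketch under assumed inputs: the three marginals in `marginals` and the reference PMF `p1` are hypothetical values chosen for illustration.

```python
import math

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(q * math.log(q) for q in p if q > 0)

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in nats, assuming q > 0 wherever p > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

marginals = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]  # hypothetical P_{Y_i}
p1 = [0.6, 0.4]                                   # hypothetical iid marginal P_1
n = len(marginals)

# P_{Z_1}: the arithmetic mean of the marginals
p_z = [sum(m[x] for m in marginals) / n for x in range(2)]

sum_div = sum(kl(m, p1) for m in marginals)     # sum_i D(P_{Y_i} || P_1)
sum_ent = sum(entropy(m) for m in marginals)    # sum_i H(P_{Y_i})
print(sum_div >= n * kl(p_z, p1), sum_ent <= n * entropy(p_z))
```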

Following a similar procedure as above, we can bound the other terms in (40) for an iid source as

$$
\sup_{\mathbf{Y} : \overline{H}(\mathbf{Y}) \le R} E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge \sup_{P_Y : H(P_Y) \le R} \{ \rho H(P_Y) - D(P_Y \| P_1) \} \tag{49}
$$

and

$$
\sup_{\mathbf{Y}} E_u(\mathbf{Y}, \mathbf{X}, \theta) = \sup_{P_Y} \{ \theta H(P_Y) - D(P_Y \| P_1) \}. \tag{50}
$$

Substitution of (43) and (49) in the lower bound of (40) yields

$$
\begin{aligned}
E_s^l(R, \rho) &\ge \max\Big\{ \rho R - \inf_{P_Y : H(P_Y) > R} D(P_Y \| P_1),\ \sup_{P_Y : H(P_Y) \le R} \{ \rho H(P_Y) - D(P_Y \| P_1) \} \Big\} \\
&= \sup_{P_Y} \{ \rho \min\{ H(P_Y), R \} - D(P_Y \| P_1) \}.
\end{aligned} \tag{51}
$$

Similarly, substitution of (50) in the upper bound of (40) yields

$$
\begin{aligned}
E_s^u(R, \rho) &\le \min_{0 \le \theta \le \rho} \Big\{ (\rho - \theta) R + \sup_{P_Y} \{ \theta H(P_Y) - D(P_Y \| P_1) \} \Big\} \\
&= \sup_{P_Y} \Big\{ \min_{0 \le \theta \le \rho} \{ (\rho - \theta) R + \theta H(P_Y) \} - D(P_Y \| P_1) \Big\} \qquad (52) \\
&= \sup_{P_Y} \{ \rho \min\{ H(P_Y), R \} - D(P_Y \| P_1) \}, \qquad (53)
\end{aligned}
$$

where the interchange of sup and min in (52) holds because the function within braces is linear in $\theta$ and concave in $P_Y$. From (51) and (53), we recover Merhav & Arikan's result (1) for an iid source [2, Eqn. (3)].
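The iid exponent $\sup_{P_Y} \{ \rho \min\{H(P_Y), R\} - D(P_Y \| P_1) \}$ can be evaluated numerically for a binary alphabet. The sketch below uses a plain grid search, which is an assumption made for illustration and not a method from the paper; the PMF `p1`, the rates, and `rho` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(q * math.log(q) for q in p if q > 0)

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in nats, assuming q > 0 everywhere."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def iid_guessing_exponent(p1, rho, rate, grid=20001):
    """Grid-search sup over binary P_Y of rho * min{H(P_Y), R} - D(P_Y || P_1)."""
    best = -math.inf
    for k in range(grid):
        a = k / (grid - 1)
        p_y = [a, 1.0 - a]
        best = max(best, rho * min(entropy(p_y), rate) - kl(p_y, p1))
    return best

p1 = [0.8, 0.2]   # hypothetical binary marginal with H(P_1) of about 0.50 nats
rho = 1.0

# Perfect secrecy (R = ln 2): should match (1 + rho) ln sum_x P_1(x)^(1/(1+rho)).
unconstrained = (1 + rho) * math.log(sum(q ** (1 / (1 + rho)) for q in p1))
perfect = iid_guessing_exponent(p1, rho, rate=math.log(2))

# Key rate below the source entropy: the sup is attained at P_Y = P_1 and equals rho * R.
limited = iid_guessing_exponent(p1, rho, rate=0.3)
print(perfect, unconstrained, limited)
```

In this illustration the key-rate-limited exponent collapses to $\rho R$ because $R < H(P_1)$: the objective is at most $\rho R$ and the choice $P_Y = P_1$ achieves it.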

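Example 3 (Markov source): In this example we focus on an irreducible stationary Markov source taking values on $\mathbb{X}$ and having a transition probability matrix $\pi$. Let $\mathcal{M}_s(\mathbb{X}^2)$ denote the set of stationary PMFs defined by

$$
\mathcal{M}_s(\mathbb{X}^2) = \Big\{ Q \in \mathcal{M}(\mathbb{X}^2) : \sum_{x_1} Q(x_1, x) = \sum_{x_2} Q(x, x_2),\ \forall x \in \mathbb{X} \Big\}.
$$

Denote the common marginal by $q$ and let

$$
\eta(\cdot \mid x_1) := \begin{cases} Q(x_1, \cdot)/q(x_1), & \text{if } q(x_1) \ne 0, \\ 1/|\mathbb{X}|, & \text{otherwise.} \end{cases}
$$

We may then denote $Q = q \times \eta$, where $q$ is the distribution of $X_1$ and $\eta$ the conditional distribution of $X_2$ given $X_1$. Following steps similar to the iid case, we have

$$
E_s^u(R, \rho) = E_s^l(R, \rho) = \sup_{Q \in \mathcal{M}_s(\mathbb{X}^2)} \{ \rho \min\{ H(\eta \mid q), R \} - D(\eta \| \pi \mid q) \},
$$

where

$$
H(\eta \mid q) := \sum_{x \in \mathbb{X}} q(x) H(\eta(\cdot \mid x))
$$

is the conditional one-step entropy, and

$$
D(\eta \| \pi \mid q) = \sum_{x_1 \in \mathbb{X}} q(x_1) D(\eta(\cdot \mid x_1) \| \pi(\cdot \mid x_1)).
$$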
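The quantities $H(\eta \mid q)$ and $D(\eta \| \pi \mid q)$ are straightforward to compute from a joint PMF $Q$. The sketch below evaluates the objective at the single stationary point $Q = q_\pi \times \pi$, which only gives a lower bound on the supremum; the transition matrix `pi`, the rate, and all helper names are hypothetical choices for illustration.

```python
import math

def marginal(Q):
    """First marginal q(x1) = sum_{x2} Q(x1, x2) of a joint PMF given as a matrix."""
    return [sum(row) for row in Q]

def conditional(Q):
    """eta(. | x1) = Q(x1, .)/q(x1), or the uniform PMF when q(x1) = 0."""
    q, size = marginal(Q), len(Q)
    return [[Q[i][j] / q[i] for j in range(size)] if q[i] > 0 else [1.0 / size] * size
            for i in range(size)]

def cond_entropy(Q):
    """H(eta | q) = sum_x q(x) H(eta(. | x)) in nats."""
    q, eta = marginal(Q), conditional(Q)
    return -sum(q[i] * e * math.log(e) for i in range(len(Q)) for e in eta[i] if e > 0)

def cond_divergence(Q, pi):
    """D(eta || pi | q) = sum_x q(x) D(eta(. | x) || pi(. | x)) in nats."""
    q, eta = marginal(Q), conditional(Q)
    size = len(Q)
    return sum(q[i] * eta[i][j] * math.log(eta[i][j] / pi[i][j])
               for i in range(size) for j in range(size) if q[i] * eta[i][j] > 0)

pi = [[0.9, 0.1], [0.3, 0.7]]   # hypothetical transition matrix
q_pi = [0.75, 0.25]             # its stationary distribution
Q = [[q_pi[i] * pi[i][j] for j in range(2)] for i in range(2)]

rho, rate = 1.0, 0.2            # hypothetical moment order and key rate (nats)
value = rho * min(cond_entropy(Q), rate) - cond_divergence(Q, pi)
print(cond_entropy(Q), cond_divergence(Q, pi), value)
```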

For a unifilar source the underlying state space forms a Markov chain and the entropy and divergence of the source equal those of the underlying Markov state space source [14, Thm. 6.4.2]. The arguments for the Markov source are now directly applicable to a unifilar source.

V. Conclusion

We saw the close connection between the problem of guessing a source realization given a cryptogram 
and the problem of compression with saturated exponential costs. The latter is a modification of a problem 
posed by Campbell [7]. Moreover, the exponents for both these problems coincide. This exponent is
determined by the error exponent and a generalization of correct decoding exponent for fixed length 
block source codes. 

We end this paper with some open questions. 

• The equivalence between guessing and compression exploits the finite alphabet size assumption. Can 
this be relaxed? 

• How do the results of this paper extend to the case with receiver side information? Can the result of 
Hayashi & Yamamoto be extended to general sources? 

• If guessing to within a distortion is allowed, can the result of Merhav & Arikan [15] be extended to 
general sources? Both cases of perfect secrecy and key-rate constrained secrecy remain open. 

Appendix A
Proof of Proposition 3

Let $P_n$ be any PMF on $\mathbb{X}^n$. Enumerate the elements of $\mathbb{X}^n$ from $1$ to $|\mathbb{X}|^n$ in the decreasing order of their $P_n$-probabilities. Let $M = \exp\{nR\}$ denote the number of distinct key strings. For convenience, we shall assume that $M$ is a power of 2 so that the number of key bits $k = nR/(\ln 2)$ is an integer. The general case will be easily handled towards the end of this section.

If $M$ does not divide $|\mathbb{X}|^n$, append a few dummy messages of zero probability to make the number of messages $N$ a multiple of $M$. Further, index the messages from $0$ to $N - 1$. Henceforth, we identify a message $x^n$ by its index.

Divide the messages into groups of $M$ so that message $m$ belongs to group $T_j$, where $j = \lfloor m/M \rfloor$, and $\lfloor \cdot \rfloor$ is the floor function. Enumerate the key streams from $0$ to $M - 1$, so that $0 \le u \le M - 1$. The function $f_n$ is now defined as follows. For $m = jM + i$ set

$$
f_n(jM + i, u) = jM + (i \oplus u),
$$

where $i \oplus u$ is the bit-wise XOR operation. Thus messages in group $T_j$ are encrypted to messages in the same group. The index $i$ identifying the specific message in group $T_j$, i.e., the last $k = nR/(\ln 2)$ bits of $m$, is encrypted via bit-wise XOR with the key stream. Given $u$ and the cryptogram, decryption is clear: perform bit-wise XOR with $u$ on the last $nR/(\ln 2)$ bits of $y$.
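The group-preserving XOR encryption described above can be sketched in a few lines. The function names and the small parameter values below are hypothetical and serve only to illustrate the construction.

```python
def encrypt(m, u, k):
    """f_n(jM + i, u) = jM + (i XOR u), with M = 2**k and key index 0 <= u < M."""
    M = 1 << k
    j, i = divmod(m, M)      # group index j and position i inside the group
    return j * M + (i ^ u)   # only the last k bits of the message index change

def decrypt(y, u, k):
    """XOR is its own inverse, so decryption applies the same map."""
    return encrypt(y, u, k)

k = 3    # hypothetical number of key bits, so M = 8 key streams
N = 24   # hypothetical (padded) number of messages, a multiple of M

# Every cryptogram of message 5 stays in group T_0 = {0, ..., 7}.
print(sorted(encrypt(5, u, k) for u in range(1 << k)))
```

Because the key is uniform over the $M$ values, each message in a group is mapped onto every member of that group equally often, which is why the cryptogram reveals only the group index.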

Given a cryptogram $y$, the only information that the attacker gleans is that the message belongs to the group determined by $y$. Indeed, if $y \in T_j$, then

$$
P_n\{Y = y\} = \frac{1}{M} P_n\{X^n \in T_j\},
$$

and therefore

$$
P_n\{X^n = m \mid Y = y\} = \begin{cases} \dfrac{P_n\{X^n = m\}}{P_n\{X^n \in T_j\}}, & m \in T_j, \\ 0, & \text{otherwise,} \end{cases}
$$

which decreases with $m$ for $m \in T_j$, because of our enumeration in the decreasing order of probabilities, and is $0$ for $m \notin T_j$. The attacker's best strategy $G_{f_n}(\cdot \mid y)$ is therefore to restrict his guesses to $T_j$ and guess in the order $jM, jM + 1, \ldots, jM + M - 1$. Thus, when $x^n = jM + i$, the optimal attack strategy requires $i + 1$ guesses.

We now analyze the performance of this attack strategy as follows.

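$$
\begin{aligned}
\mathbb{E}\big[ G_{f_n}(X^n \mid Y)^\rho \big]
&= \sum_{j=0}^{N/M-1} \sum_{i=0}^{M-1} P_n\{X^n = jM + i\} (i + 1)^\rho \\
&\ge \sum_{j=0}^{N/M-1} \sum_{i=0}^{M-1} P_n\{X^n = (j+1)M - 1\} (i + 1)^\rho \qquad (54) \\
&\ge \sum_{j=0}^{N/M-1} P_n\{X^n = (j+1)M - 1\} \frac{M^{1+\rho}}{1+\rho} \qquad (55) \\
&\ge \frac{1}{1+\rho} \sum_{j=0}^{N/M-1} \sum_{i=0}^{M-1} P_n\{X^n = (j+1)M + i\} M^\rho \qquad (56) \\
&= \frac{1}{1+\rho} \sum_{m=M}^{N-1} P_n\{X^n = m\} M^\rho, \qquad (57)
\end{aligned}
$$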

where (54) follows because the arrangement in the decreasing order of probabilities implies that

$$
P_n\{X^n = jM + i\} \ge P_n\{X^n = (j+1)M - 1\}
$$

for $i = 0, \ldots, M - 1$. Inequality (55) follows because

$$
\sum_{i=0}^{M-1} (i + 1)^\rho = \sum_{i=1}^{M} i^\rho \ge \int_0^M z^\rho \, dz = \frac{M^{1+\rho}}{1+\rho}.
$$

Inequality (56) follows because the decreasing probability arrangement implies

$$
P_n\{X^n = (j+1)M - 1\} \ge \frac{1}{M} \sum_{i=0}^{M-1} P_n\{X^n = (j+1)M + i\}.
$$

Finally, (57) follows because we take $P_n\{X^n = m\} = 0$ for all the further dummy messages with indices $m \ge N$. Thus (57) implies that



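$$
\begin{aligned}
\sum_{m=0}^{N-1} P_n\{X^n = m\} (\min\{m + 1, M\})^\rho
&= \sum_{m=0}^{M-1} P_n\{X^n = m\} (m + 1)^\rho + \sum_{m=M}^{N-1} P_n\{X^n = m\} M^\rho \\
&\le \mathbb{E}\big[ G_{f_n}(X^n \mid Y)^\rho \big] + (1 + \rho) \mathbb{E}\big[ G_{f_n}(X^n \mid Y)^\rho \big] \\
&= (2 + \rho) \mathbb{E}\big[ G_{f_n}(X^n \mid Y)^\rho \big].
\end{aligned} \tag{58}
$$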
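Inequality (58) can be checked numerically for a small decreasing PMF. The sketch below assumes a hypothetical distribution with weights proportional to $1/(m+1)$ and hypothetical values of $M$, $N$, and $\rho$; it illustrates the bound and is not part of the proof.

```python
def guessing_moment(probs, M, rho):
    """E[G^rho] when message m = jM + i needs i + 1 guesses under the optimal attack."""
    return sum(p * ((m % M) + 1) ** rho for m, p in enumerate(probs))

def truncated_moment(probs, M, rho):
    """sum_m P(m) * (min{m + 1, M})^rho, the left-hand side of (58)."""
    return sum(p * min(m + 1, M) ** rho for m, p in enumerate(probs))

M, N, rho = 4, 16, 1.5                        # hypothetical parameters, N a multiple of M
weights = [1.0 / (m + 1) for m in range(N)]   # decreasing in the message index
total = sum(weights)
probs = [w / total for w in weights]

lhs = truncated_moment(probs, M, rho)
rhs = (2 + rho) * guessing_moment(probs, M, rho)
print(lhs, rhs)   # lhs should not exceed rhs
```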

Let $G$ be the guessing function that guesses in the decreasing order of $P_n$-probabilities without regard to $Y$, i.e., $G(m) = m + 1$. Let $L_G$ be the associated length function, given in Lemma 2. Now use (58) and Lemma 2 to get

$$
\begin{aligned}
\mathbb{E}\big[ G_{f_n}(X^n \mid Y)^\rho \big]
&\ge \frac{1}{2+\rho} \mathbb{E}\big[ (\min\{G(X^n), M\})^\rho \big] \\
&\ge \frac{1}{2+\rho} \mathbb{E}\left[ \left( \min\left\{ \frac{\exp_2\{L_G(X^n)\}}{2 c_n}, M \right\} \right)^\rho \right] \\
&\ge \frac{1}{(2 c_n)^\rho (2 + \rho)} \mathbb{E}\big[ \exp\{ \rho \min\{ L_G(X^n) \ln 2, nR \} \} \big],
\end{aligned} \tag{59}
$$

where the last inequality follows by pulling out $2 c_n$ and recognizing that $2 c_n M \ge M \ge \exp\{nR\}$. Since $G_{f_n}$ is the strategy that minimizes $\mathbb{E}[G(X^n \mid Y)^\rho]$, the proof is complete for the cases when $k = nR/(\ln 2)$ is an integer.

When $nR/(\ln 2)$ is not an integer, choose $k = \lceil nR/(\ln 2) \rceil$. Then $M = \exp_2\{k\} \ge \exp\{nR\}$, and it immediately follows that inequality (59) continues to hold. This completes the proof. ■



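Appendix B
Proof of Proposition 6

We begin with the following lemma. Recall that $\mathcal{M}(\mathbb{X})$ is the set of all probability measures on $\mathbb{X}$ and $\mathcal{M}(B)$ the subset of $\mathcal{M}(\mathbb{X})$ with support set $B \subseteq \mathbb{X}$:

$$
\mathcal{M}(B) = \{ \nu \in \mathcal{M}(\mathbb{X}) : \nu(B) = 1 \}.
$$

Lemma 12: For any $\rho > 0$, $\mu \in \mathcal{M}(\mathbb{X})$ and $B \subseteq \mathbb{X}$,

$$
(1 + \rho) \ln \sum_{x \in B} \mu^{\frac{1}{1+\rho}}(x) = \max_{\nu \in \mathcal{M}(B)} \{ \rho H(\nu) - D(\nu \| \mu) \}.
$$

Remark 6: [6, Lemma 1] is the special case when $\rho = 0$. □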

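Proof: Let $\mu_B(x) = \frac{\mu(x)}{\mu(B)} 1\{x \in B\}$. We then have

$$
\begin{aligned}
(1 + \rho) \ln \sum_{x \in B} \mu^{\frac{1}{1+\rho}}(x)
&= (1 + \rho) \ln \sum_{x \in B} \mu_B^{\frac{1}{1+\rho}}(x) + \ln \mu(B) \\
&= (1 + \rho) \max_{\nu \in \mathcal{M}(B)} \left\{ \frac{\rho}{1+\rho} H(\nu) - \frac{1}{1+\rho} D(\nu \| \mu_B) \right\} + \ln \mu(B) \qquad (60) \\
&= (1 + \rho) \max_{\nu \in \mathcal{M}(B)} \left\{ \frac{\rho}{1+\rho} H(\nu) - \frac{1}{1+\rho} D(\nu \| \mu) \right\} \qquad (61) \\
&= \max_{\nu \in \mathcal{M}(B)} \{ \rho H(\nu) - D(\nu \| \mu) \}, \qquad (62)
\end{aligned}
$$

where (60) follows from the variational formula for Rényi entropy, and (61) holds because $D(\nu \| \mu_B) = D(\nu \| \mu) + \ln \mu(B)$ for every $\nu \in \mathcal{M}(B)$. The maximum achieving distribution in (62) is $\nu^* \in \mathcal{M}(B)$ given by

$$
\nu^*(x) = \frac{\mu^{\frac{1}{1+\rho}}(x)}{\sum_{y \in B} \mu^{\frac{1}{1+\rho}}(y)} 1\{x \in B\},
$$

a fact that is easily verified via direct substitution. ■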
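The variational identity of Lemma 12 and its maximizer can be checked numerically. In the sketch below, the measure `mu`, the support `B`, and `rho` are hypothetical values; random points of $\mathcal{M}(B)$ are compared against the closed form as a sanity check, not as a proof.

```python
import math
import random

def entropy(nu):
    """Shannon entropy in nats."""
    return -sum(v * math.log(v) for v in nu if v > 0)

def kl(nu, mu):
    """D(nu || mu) in nats, assuming mu > 0 wherever nu > 0."""
    return sum(a * math.log(a / b) for a, b in zip(nu, mu) if a > 0)

def objective(nu, mu, rho):
    """rho * H(nu) - D(nu || mu), the quantity maximized in Lemma 12."""
    return rho * entropy(nu) - kl(nu, mu)

mu = [0.4, 0.3, 0.2, 0.1]   # hypothetical PMF on a 4-letter alphabet
B = [0, 2, 3]               # hypothetical support set
rho = 0.5
s = 1.0 / (1.0 + rho)

Z = sum(mu[x] ** s for x in B)
closed_form = (1.0 + rho) * math.log(Z)
nu_star = [mu[x] ** s / Z if x in B else 0.0 for x in range(len(mu))]
best = objective(nu_star, mu, rho)

# Random members of M(B) should never beat the closed form.
random.seed(1)
trials = []
for _ in range(200):
    w = [random.random() if x in B else 0.0 for x in range(len(mu))]
    t = sum(w)
    trials.append(objective([v / t for v in w], mu, rho))
print(closed_form, best, max(trials))
```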
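We now prove (11); the proof of (10) is similar and therefore omitted. We begin by showing "$\le$" in (11). Let $\mathbf{X}^* = (P^*_{X^n} : n \in \mathbb{N}) \in \mathcal{M}(\mathbf{B})$ be as defined in (12). It is straightforward to verify by direct substitution that

$$
(1 + \rho) \ln \sum_{x^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n) = \rho H(P^*_{X^n}) - D(P^*_{X^n} \| P_{X^n}).
$$

Normalize by $n$ and take limit infimum, and use the definition of $E_l(\mathbf{X}^*, \mathbf{X}, \rho)$ to get

$$
\begin{aligned}
(1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n)
&= E_l(\mathbf{X}^*, \mathbf{X}, \rho) \qquad (63) \\
&\le \max_{\mathbf{Y} \in \mathcal{M}(\mathbf{B})} E_l(\mathbf{Y}, \mathbf{X}, \rho).
\end{aligned}
$$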

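To prove "$\ge$" in (11), let $\mathbf{Y} = (P_{Y^n} : n \in \mathbb{N}) \in \mathcal{M}(\mathbf{B})$ be an arbitrary sequence. We may assume that for all sufficiently large $n$, $P_{Y^n} \ll P_{X^n}$ holds; otherwise $E_l(\mathbf{Y}, \mathbf{X}, \rho) = -\infty$ and the inequality "$\ge$" holds automatically. Define $\tilde{\mathbf{Y}} = (P_{\tilde{Y}^n} : n \in \mathbb{N}) \in \mathcal{M}(\mathbf{B})$ by

$$
P_{\tilde{Y}^n}(x^n) = \frac{P_{Y^n}(x^n)}{P_{Y^n}(B_n)} 1\{x^n \in B_n\}.
$$

It is clear that $P_{\tilde{Y}^n} \in \mathcal{M}(B_n)$ for every $n$. From Lemma 12 we have

$$
\begin{aligned}
(1 + \rho) \ln \sum_{x^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n)
&= \max_{P_{Y^n} \in \mathcal{M}(B_n)} \{ \rho H(P_{Y^n}) - D(P_{Y^n} \| P_{X^n}) \} \\
&\ge \rho H(P_{\tilde{Y}^n}) - D(P_{\tilde{Y}^n} \| P_{X^n}).
\end{aligned} \tag{64}
$$

We now study each term on the right-hand side of (64). The entropy term is lower bounded as follows: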

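$$
\begin{aligned}
\rho H(P_{\tilde{Y}^n})
&= \frac{\rho}{P_{Y^n}(B_n)} \sum_{x^n \in B_n} P_{Y^n}(x^n) \ln \frac{1}{P_{Y^n}(x^n)} + \rho \ln P_{Y^n}(B_n) \\
&= \frac{\rho}{P_{Y^n}(B_n)} \Big\{ H(P_{Y^n}) - P_{Y^n}(B_n^c) H(P_{Y^n} \mid B_n^c) + P_{Y^n}(B_n^c) \ln P_{Y^n}(B_n^c) \Big\} + \rho \ln P_{Y^n}(B_n) \\
&\ge \frac{\rho}{P_{Y^n}(B_n)} \Big\{ H(P_{Y^n}) - P_{Y^n}(B_n^c)\, n \ln |\mathbb{X}| + P_{Y^n}(B_n^c) \ln P_{Y^n}(B_n^c) \Big\} + \rho \ln P_{Y^n}(B_n).
\end{aligned} \tag{65}
$$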

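The divergence term is upper bounded, as in the proof of Iriyama's [6, Prop. 1], as follows:

$$
\begin{aligned}
D(P_{\tilde{Y}^n} \| P_{X^n})
&= -\ln P_{Y^n}(B_n) + \frac{1}{P_{Y^n}(B_n)} \sum_{x^n \in B_n} P_{Y^n}(x^n) \ln \frac{P_{Y^n}(x^n)}{P_{X^n}(x^n)} \\
&= -\ln P_{Y^n}(B_n) + \frac{1}{P_{Y^n}(B_n)} \Big\{ D(P_{Y^n} \| P_{X^n}) - \sum_{x^n \in B_n^c} P_{Y^n}(x^n) \ln \frac{P_{Y^n}(x^n)}{P_{X^n}(x^n)} \Big\} \\
&\le -\ln P_{Y^n}(B_n) + \frac{1}{P_{Y^n}(B_n)} \Big\{ D(P_{Y^n} \| P_{X^n}) - \big( P_{Y^n}(B_n^c) - P_{X^n}(B_n^c) \big) \Big\} \qquad (66) \\
&\le -\ln P_{Y^n}(B_n) + \frac{1}{P_{Y^n}(B_n)} \Big\{ D(P_{Y^n} \| P_{X^n}) + 1 \Big\}. \qquad (67)
\end{aligned}
$$

To get (66), we used the fact that $\ln x \ge 1 - \frac{1}{x}$ for all $x > 0$, and in inequality (67) we used the relation

$$
P_{Y^n}(B_n^c) - P_{X^n}(B_n^c) \ge -1.
$$

Substitution of (65) and (67) in (64) and the fact that $\lim_{n \to \infty} P_{Y^n}(B_n) = 1$ yield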

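$$
\begin{aligned}
(1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n)
&\ge \liminf_{n \to \infty} \Big\{ \frac{1}{n} \big[ \rho H(P_{Y^n}) - D(P_{Y^n} \| P_{X^n}) \big] - o(1) \Big\} \\
&= E_l(\mathbf{Y}, \mathbf{X}, \rho).
\end{aligned}
$$

Since the choice of $\mathbf{Y} = (P_{Y^n} : n \in \mathbb{N}) \in \mathcal{M}(\mathbf{B})$ was arbitrary, we have proved "$\ge$" in (11).

From (63) and (11), the maximum is attained by $\mathbf{X}^*$, the distribution defined in (12). This completes the proof. ■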
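The entropy bound (65) and the divergence bound (67) used in this proof can be sanity-checked in a single-letter setting ($n = 1$). The distributions `p_y`, `p_x` and the set `B` below are hypothetical, and the check only illustrates the two inequalities for one instance.

```python
import math

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(v * math.log(v) for v in p if v > 0)

def kl(p, q):
    """D(p || q) in nats, assuming q > 0 wherever p > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

p_y = [0.5, 0.3, 0.15, 0.05]    # hypothetical P_Y (n = 1)
p_x = [0.25, 0.25, 0.25, 0.25]  # hypothetical P_X (n = 1)
B = {0, 1, 2}                   # hypothetical set B_n
size = len(p_y)

mass = sum(p_y[x] for x in B)   # P_Y(B)
comp = 1.0 - mass               # P_Y(B^c)
p_tilde = [p_y[x] / mass if x in B else 0.0 for x in range(size)]

# Right-hand side of (65) divided by rho, with n ln|X| = ln 4 here.
ent_bound = (entropy(p_y) - comp * math.log(size) + comp * math.log(comp)) / mass + math.log(mass)
# Right-hand side of (67).
div_bound = -math.log(mass) + (kl(p_y, p_x) + 1.0) / mass

print(entropy(p_tilde) >= ent_bound, kl(p_tilde, p_x) <= div_bound)
```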

Appendix C
Proof of Proposition 9

Iriyama & Ihara showed the following lower bound on the infimum coding rate ([13, Th. 3, Eqn. (12)]):

$$
\sup_{\mathbf{Y} : D_u(\mathbf{Y} \| \mathbf{X}) < r} \underline{H}(\mathbf{Y}) \le R(r|\mathbf{X}). \tag{68}
$$

We claim that (68) is equivalent to (34). This proves the proposition.

We first show that (68) implies (34). Fix the source $\mathbf{X}$. Let $R$ be a given rate. Consider an arbitrary candidate exponent $r$ and an arbitrary source $\mathbf{Y}$. We argue that

$$
R \text{ is } r\text{-achievable and } \underline{H}(\mathbf{Y}) > R \implies r \le D_u(\mathbf{Y} \| \mathbf{X}). \tag{69}
$$

Taking the infimum on the right-hand side of (69) over $\mathbf{Y}$ with $\underline{H}(\mathbf{Y}) > R$, and then the supremum over $r$, will yield (34).

To argue (69) by contraposition, we shall show that

$$
r > D_u(\mathbf{Y} \| \mathbf{X}) \implies \text{either } R \text{ is not } r\text{-achievable or } \underline{H}(\mathbf{Y}) \le R,
$$

or equivalently, we shall show that

$$
r > D_u(\mathbf{Y} \| \mathbf{X}) \text{ and } \underline{H}(\mathbf{Y}) > R \implies R \text{ is not } r\text{-achievable.}
$$

But the conditions on the left-hand side imply

$$
\sup_{\mathbf{Y} : D_u(\mathbf{Y} \| \mathbf{X}) < r} \underline{H}(\mathbf{Y}) > R,
$$

which together with (68) yields $R(r|\mathbf{X}) > R$, and this is the same as saying $R$ is not $r$-achievable. This completes the proof of (68) $\Rightarrow$ (34). (This direction suffices to prove Proposition 9.) The proof of the other direction is analogous. ■
To prove the upper bound in (35), we begin with Iriyama's [6, Eqn. (13)], which is

$$
\sup_{\mathbf{Y} : D_u(\mathbf{Y} \| \mathbf{X}) < r} \{ R(\mathbf{Y}, \mathbf{X}) - D_u(\mathbf{Y} \| \mathbf{X}) \} \le R(r|\mathbf{X}),
$$

instead of (68). The rest of the proof is completely analogous to the proof of Proposition 9.

Appendix D
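Proof of Proposition 10

We use the following notations in this proof. For each $\mathbf{B} = (B_n : n \in \mathbb{N})$ define

$$
|\mathbf{B}| := \limsup_{n \to \infty} \frac{1}{n} \ln |B_n|
$$

and

$$
\mathcal{S}(\mathbf{Y}) := \Big\{ \mathbf{B} : \lim_{n \to \infty} P_{Y^n}(B_n) = 1 \Big\}.
$$

Note that $\mathbf{B} \in \mathcal{S}(\mathbf{Y}) \iff \mathbf{Y} \in \mathcal{M}(\mathbf{B})$. We will first prove (38). Define a set

$$
\mathcal{B}(r, \rho|\mathbf{X}) = \Big\{ \mathbf{B} := (B_n : n \in \mathbb{N}) : (1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n) \ge r \Big\}. \tag{70}
$$

Then, by definition,

$$
R^*(r, \rho|\mathbf{X}) = \inf \{ |\mathbf{B}| : \mathbf{B} \in \mathcal{B}(r, \rho|\mathbf{X}) \}. \tag{71}
$$

Fix a $\mathbf{B} \in \mathcal{B}(r, \rho|\mathbf{X})$. Proposition 6 then implies

$$
(1 + \rho) \liminf_{n \to \infty} \frac{1}{n} \ln \sum_{x^n \in B_n} P_{X^n}^{\frac{1}{1+\rho}}(x^n) = \max_{\mathbf{Y} : \mathbf{B} \in \mathcal{S}(\mathbf{Y})} E_l(\mathbf{Y}, \mathbf{X}, \rho).
$$

We can therefore conclude using (70) that the following set equivalence holds:

$$
\mathcal{B}(r, \rho|\mathbf{X}) = \bigcup_{\mathbf{Y} : E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r} \mathcal{S}(\mathbf{Y}). \tag{72}
$$

From (71) and (72) we get

$$
\begin{aligned}
R^*(r, \rho|\mathbf{X})
&= \inf \Big\{ |\mathbf{B}| : \mathbf{B} \in \bigcup_{\mathbf{Y} : E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r} \mathcal{S}(\mathbf{Y}) \Big\} \\
&= \inf \{ |\mathbf{B}| : E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r,\ \mathbf{B} \in \mathcal{S}(\mathbf{Y}) \} \\
&= \inf_{\mathbf{Y} : E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r} \overline{H}(\mathbf{Y}),
\end{aligned}
$$

where the last equality follows because

$$
\overline{H}(\mathbf{Y}) = \inf \{ |\mathbf{B}| : \mathbf{B} \in \mathcal{S}(\mathbf{Y}) \}
$$

as proved by Han & Verdú [16]. This proves (38).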

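We now prove (39). We first show that if $R$ is $(r, \rho)$-admissible then $r \le \sup_{\mathbf{Y} : \overline{H}(\mathbf{Y}) \le R} E_l(\mathbf{Y}, \mathbf{X}, \rho)$. Since $R$ is $(r, \rho)$-admissible, the definition of $R^*(r, \rho|\mathbf{X})$ and (38) imply

$$
R \ge R^*(r, \rho|\mathbf{X}) = \inf_{\mathbf{Y} : E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r} \overline{H}(\mathbf{Y}),
$$

i.e., for all $\delta > 0$ there exists a $\mathbf{Y}$ such that

$$
E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r \text{ and } \overline{H}(\mathbf{Y}) \le R + \delta,
$$

which further implies that

$$
r \le \sup_{\mathbf{Y} : \overline{H}(\mathbf{Y}) \le R + \delta} E_l(\mathbf{Y}, \mathbf{X}, \rho).
$$

Since $\delta$ was arbitrary, letting $\delta \downarrow 0$ yields

$$
r \le \sup_{\mathbf{Y} : \overline{H}(\mathbf{Y}) \le R} E_l(\mathbf{Y}, \mathbf{X}, \rho),
$$

and the converse part is proved.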

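For the direct part it is sufficient to show that given $\rho$, any $R$ with

$$
r := \sup_{\mathbf{Y} : \overline{H}(\mathbf{Y}) \le R} E_l(\mathbf{Y}, \mathbf{X}, \rho)
$$

is $(r, \rho)$-admissible. By choice of $r$, for all $\delta > 0$, there exists a $\mathbf{Y}$ such that

$$
E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r - \delta \text{ and } \overline{H}(\mathbf{Y}) \le R.
$$

This implies that

$$
\inf_{\mathbf{Y} : E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r - \delta} \overline{H}(\mathbf{Y}) \le R.
$$

Since $\delta$ was arbitrary, let $\delta \downarrow 0$ and use (38) to get

$$
\inf_{\mathbf{Y} : E_l(\mathbf{Y}, \mathbf{X}, \rho) \ge r} \overline{H}(\mathbf{Y}) = R^*(r, \rho|\mathbf{X}) \le R,
$$

i.e., $R$ is $(r, \rho)$-admissible. This completes the proof. ■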